智能论文笔记

Physics to the Rescue: Deep Non-line-of-sight Reconstruction for High-speed Imaging

Fangzhou Mu , Sicheng Mo , Jiayong Peng , Xiaochun Liu , Ji Hyun Nam , Siddeshwar Raghavan , Andreas Velten , Yin Li

分类：计算机视觉

2022-05-03

由于成像硬件和重建算法的重大进展，计算成像拐角处或非视线（NLOS）成像的方法正在成为现实。 NAM等人的最新发展NLOS成像。展示了一个高速非焦距成像系统，其运行速度为5Hz，比以前的ART快100倍。然而，这种巨大的采集率增长需要在光传输中进行大量近似，打破了许多现有的NLOS重建方法，这些方法采用了理想化的图像形成模型。为了弥合差距，我们提出了一个新颖的深层模型，该模型结合了波传播和体积渲染的互补物理学先验，以进行高质量和强大的NLOS重建。该精心策划的设计通过放松图像形成模型来规范解决方案空间，从而产生了一个深层模型，尽管在合成数据上只接受了专门的培训，但在真实捕获上却很好地概括了。此外，我们设计了一个统一的学习框架，使我们的模型能够使用各种监督信号（包括目标强度图像甚至RAW NLOS瞬态测量）灵活训练我们的模型。一旦受过训练，我们的模型就会在一次前传球中的推理时间呈现强度和深度图像，能够在高端GPU上处理超过5个以上的捕获。通过广泛的定性和定量实验，我们表明我们的方法的表现优于先前的物理和基于学习的方法，同时基于合成和实际测量。我们预计，我们的方法以及快速捕获系统将加速NLOS成像的未来开发，用于需要高速成像的现实世界应用。

translated by 谷歌翻译

Unpaired Overwater Image Defogging Using Prior Map Guided CycleGAN

Yaozong Mo , Chaofeng Li , Wenqi Ren , Shaopeng Shang , Wenwu Wang , Xiao-jun Wu

分类：计算机视觉 | 人工智能

2022-12-23

Deep learning-based methods have achieved significant performance for image defogging. However, existing methods are mainly developed for land scenes and perform poorly when dealing with overwater foggy images, since overwater scenes typically contain large expanses of sky and water. In this work, we propose a Prior map Guided CycleGAN (PG-CycleGAN) for defogging of images with overwater scenes. To promote the recovery of the objects on water in the image, two loss functions are exploited for the network where a prior map is designed to invert the dark channel and the min-max normalization is used to suppress the sky and emphasize objects. However, due to the unpaired training set, the network may learn an under-constrained domain mapping from foggy to fog-free image, leading to artifacts and loss of details. Thus, we propose an intuitive Upscaling Inception Module (UIM) and a Long-range Residual Coarse-to-fine framework (LRC) to mitigate this issue. Extensive experiments on qualitative and quantitative comparisons demonstrate that the proposed method outperforms the state-of-the-art supervised, semi-supervised, and unsupervised defogging approaches.

translated by 谷歌翻译

A Neural Network Warm-Start Approach for the Inverse Acoustic Obstacle Scattering Problem

Mo Zhou , Jiequn Han , Manas Rachh , Carlos Borges

分类：机器学习

2022-12-16

We consider the inverse acoustic obstacle problem for sound-soft star-shaped obstacles in two dimensions wherein the boundary of the obstacle is determined from measurements of the scattered field at a collection of receivers outside the object. One of the standard approaches for solving this problem is to reformulate it as an optimization problem: finding the boundary of the domain that minimizes the $L^2$ distance between computed values of the scattered field and the given measurement data. The optimization problem is computationally challenging since the local set of convexity shrinks with increasing frequency and results in an increasing number of local minima in the vicinity of the true solution. In many practical experimental settings, low frequency measurements are unavailable due to limitations of the experimental setup or the sensors used for measurement. Thus, obtaining a good initial guess for the optimization problem plays a vital role in this environment. We present a neural network warm-start approach for solving the inverse scattering problem, where an initial guess for the optimization problem is obtained using a trained neural network. We demonstrate the effectiveness of our method with several numerical examples. For high frequency problems, this approach outperforms traditional iterative methods such as Gauss-Newton initialized without any prior (i.e., initialized using a unit circle), or initialized using the solution of a direct method such as the linear sampling method. The algorithm remains robust to noise in the scattered field measurements and also converges to the true solution for limited aperture data. However, the number of training samples required to train the neural network scales exponentially in frequency and the complexity of the obstacles considered. We conclude with a discussion of this phenomenon and potential directions for future research.

translated by 谷歌翻译

Controllable Text Generation via Probability Density Estimation in the Latent Space

Yuxuan Gu , Xiaocheng Feng , Sicheng Ma , Lingyuan Zhang , Heng Gong , Bing Qin

分类：自然语言处理

2022-12-16

Previous work on controllable text generation has explored the idea of control from the latent space, such as optimizing a representation with attribute-related classifiers or sampling a representation from relevant discrete samples. However, they are not effective enough in modeling both the latent space and the control, leaving controlled text with low quality and diversity. In this work, we propose a novel control framework using probability density estimation in the latent space. Our method utilizes an invertible transformation function, the Normalizing Flow, that maps the complex distributions in the latent space to simple Gaussian distributions in the prior space. Thus, we can perform sophisticated and flexible control in the prior space and feed the control effects back into the latent space owing to the one-one-mapping property of invertible transformations. Experiments on single-attribute controls and multi-attribute control reveal that our method outperforms several strong baselines on attribute relevance and text quality and achieves the SOTA. Further analysis of control strength adjustment demonstrates the flexibility of our control strategy.

translated by 谷歌翻译

SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory

Sicheng Li , Hao Li , Yue Wang , Yiyi Liao , Lu Yu

分类：计算机视觉

2022-12-15

Neural Radiance Fields (NeRF) have demonstrated superior novel view synthesis performance but are slow at rendering. To speed up the volume rendering process, many acceleration methods have been proposed at the cost of large memory consumption. To push the frontier of the efficiency-memory trade-off, we explore a new perspective to accelerate NeRF rendering, leveraging a key fact that the viewpoint change is usually smooth and continuous in interactive viewpoint control. This allows us to leverage the information of preceding viewpoints to reduce the number of rendered pixels as well as the number of sampled points along the ray of the remaining pixels. In our pipeline, a low-resolution feature map is rendered first by volume rendering, then a lightweight 2D neural renderer is applied to generate the output image at target resolution leveraging the features of preceding and current frames. We show that the proposed method can achieve competitive rendering quality while reducing the rendering time with little memory overhead, enabling 30FPS at 1080P image resolution with a low memory footprint.

translated by 谷歌翻译

Faster Maximum Inner Product Search in High Dimensions

Mo Tiwari , Ryan Kang , Je-Yong Lee , Luke Lee , Chris Piech , Sebastian Thrun , Ilan Shomorony , Martin Jinye Zhang

分类：机器学习 | 人工智能

2022-12-14

Maximum Inner Product Search (MIPS) is a popular problem in the machine learning literature due to its applicability in a wide array of applications, such as recommender systems. In high-dimensional settings, however, MIPS queries can become computationally expensive as most existing solutions do not scale well with data dimensionality. In this work, we present a state-of-the-art algorithm for the MIPS problem in high dimensions, dubbed BanditMIPS. BanditMIPS is a randomized algorithm that borrows techniques from multi-armed bandits to reduce the MIPS problem to a best-arm identification problem. BanditMIPS reduces the complexity of state-of-the-art algorithms from $O(\sqrt{d})$ to $O(\text{log}d)$, where $d$ is the dimension of the problem data vectors. On high-dimensional real-world datasets, BanditMIPS runs approximately 12 times faster than existing approaches and returns the same solution. BanditMIPS requires no preprocessing of the data and includes a hyperparameter that practitioners may use to trade off accuracy and runtime. We also propose a variant of our algorithm, named BanditMIPS-$\alpha$, which employs non-uniform sampling across the data dimensions to provide further speedups.

translated by 谷歌翻译

MABSplit: Faster Forest Training Using Multi-Armed Bandits

Mo Tiwari , Ryan Kang , Je-Yong Lee , Sebastian Thrun , Chris Piech , Ilan Shomorony , Martin Jinye Zhang

分类：机器学习

2022-12-14

Random forests are some of the most widely used machine learning models today, especially in domains that necessitate interpretability. We present an algorithm that accelerates the training of random forests and other popular tree-based learning methods. At the core of our algorithm is a novel node-splitting subroutine, dubbed MABSplit, used to efficiently find split points when constructing decision trees. Our algorithm borrows techniques from the multi-armed bandit literature to judiciously determine how to allocate samples and computational power across candidate split points. We provide theoretical guarantees that MABSplit improves the sample complexity of each node split from linear to logarithmic in the number of data points. In some settings, MABSplit leads to 100x faster training (an 99% reduction in training time) without any decrease in generalization performance. We demonstrate similar speedups when MABSplit is used across a variety of forest-based variants, such as Extremely Random Forests and Random Patches. We also show our algorithm can be used in both classification and regression tasks. Finally, we show that MABSplit outperforms existing methods in generalization performance and feature importance calculations under a fixed computational budget. All of our experimental results are reproducible via a one-line script at https://github.com/ThrunGroup/FastForest.

translated by 谷歌翻译

OAMixer: Object-aware Mixing Layer for Vision Transformers

Hyunwoo Kang , Sangwoo Mo , Jinwoo Shin

分类：计算机视觉 | 机器学习

2022-12-13

Patch-based models, e.g., Vision Transformers (ViTs) and Mixers, have shown impressive results on various visual recognition tasks, alternating classic convolutional networks. While the initial patch-based models (ViTs) treated all patches equally, recent studies reveal that incorporating inductive bias like spatiality benefits the representations. However, most prior works solely focused on the location of patches, overlooking the scene structure of images. Thus, we aim to further guide the interaction of patches using the object information. Specifically, we propose OAMixer (object-aware mixing layer), which calibrates the patch mixing layers of patch-based models based on the object labels. Here, we obtain the object labels in unsupervised or weakly-supervised manners, i.e., no additional human-annotating cost is necessary. Using the object labels, OAMixer computes a reweighting mask with a learnable scale parameter that intensifies the interaction of patches containing similar objects and applies the mask to the patch mixing layers. By learning an object-centric representation, we demonstrate that OAMixer improves the classification accuracy and background robustness of various patch-based models, including ViTs, MLP-Mixers, and ConvMixers. Moreover, we show that OAMixer enhances various downstream tasks, including large-scale classification, self-supervised learning, and multi-object recognition, verifying the generic applicability of OAMixer

translated by 谷歌翻译

Breaking the Spurious Causality of Conditional Generation via Fairness Intervention with Corrective Sampling

Junhyun Nam , Sangwoo Mo , Jaeho Lee , Jinwoo Shin

分类：计算机视觉 | 人工智能 | 机器学习

2022-12-05

Trying to capture the sample-label relationship, conditional generative models often end up inheriting the spurious correlation in the training dataset, giving label-conditional distributions that are severely imbalanced in another latent attribute. To mitigate such undesirable correlations engraved into generative models, which we call spurious causality, we propose a general two-step strategy. (a) Fairness Intervention (FI): Emphasize the minority samples that are hard to be generated due to the spurious correlation in the training dataset. (b) Corrective Sampling (CS): Filter the generated samples explicitly to follow the desired label-conditional latent attribute distribution. We design the fairness intervention for various degrees of supervision on the spurious attribute, including unsupervised, weakly-supervised, and semi-supervised scenarios. Our experimental results show that the proposed FICS can successfully resolve the spurious correlation in generated samples on various datasets.

translated by 谷歌翻译

Slack-based tunable damping leads to a trade-off between robustness and efficiency in legged locomotion

An Mo , Fabio Izzi , Emre Cemal Gönen , Daniel Haeufle , Alexander Badri-Spröwitz

分类：机器人

2022-12-01

Animals run robustly in diverse terrain. This locomotion robustness is puzzling because axon conduction velocity is limited to a few ten meters per second. If reflex loops deliver sensory information with significant delays, one would expect a destabilizing effect on sensorimotor control. Hence, an alternative explanation describes a hierarchical structure of low-level adaptive mechanics and high-level sensorimotor control to help mitigate the effects of transmission delays. Motivated by the concept of an adaptive mechanism triggering an immediate response, we developed a tunable physical damper system. Our mechanism combines a tendon with adjustable slackness connected to a physical damper. The slack damper allows adjustment of damping force, onset timing, effective stroke, and energy dissipation. We characterize the slack damper mechanism mounted to a legged robot controlled in open-loop mode. The robot hops vertically and planar over varying terrains and perturbations. During forward hopping, slack-based damping improves faster perturbation recovery (up to 170%) at higher energetic cost (27%). The tunable slack mechanism auto-engages the damper during perturbations, leading to a perturbation-trigger damping, improving robustness at minimum energetic cost. With the results from the slack damper mechanism, we propose a new functional interpretation of animals' redundant muscle tendons as tunable dampers.

translated by 谷歌翻译